Enable capture of RX features from RADE decoder #776

Merged 27 commits on Dec 4, 2024

tmiw
Collaborator

@tmiw tmiw commented Nov 25, 2024

Per request from #770, this PR adds functionality to the current unit test framework to allow capture of the RX features from the RADE decoder.

Example usage:

(radae-venv) mooneer@macaron radae % ./inference.sh model19_check3/checkpoints/checkpoint_epoch_100.pth wav/brian_g8sez.wav /dev/null \
    --rate_Fs --pilots --pilot_eq --eq_ls --cp 0.004 --bottleneck 3 --auxdata --write_rx rx.f32 --correct_freq_offset
encoder: 937200 weights
decoder: 907764 weights
encoder: 937200 weights
decoder: 907764 weights
Rs: 33.33 Rs': 50.00 Ts': 0.020 Nsmf: 120 Ns:   4 Nc:  30 M: 160 Ncp: 32
/Users/mooneer/radae/./inference.py:105: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(args.model_name, map_location='cpu')
Processing: 972 feature vectors
          Eb/No   C/No     SNR3k  Rb'    Eq     PAPR
Target..: 100.00  133.01   98.24  3000
Measured:  97.45  132.22   97.45                 0.79
loss: 0.129 BER: 0.000
(radae-venv) mooneer@macaron radae % cat features_in.f32 | python3 radae_tx.py model19_check3/checkpoints/checkpoint_epoch_100.pth --auxdata | sox -t raw -e floating-point -b 32 -c 2 -r 8000 - -t wav -e signed-integer -b 16 -r 8000 -c 1 tx.wav remix 1
encoder: 937200 weights
decoder: 907764 weights
encoder: 937200 weights
decoder: 907764 weights
Rs: 33.33 Rs': 50.00 Ts': 0.020 Nsmf: 120 Ns:   4 Nc:  30 M: 160 Ncp: 32
/Users/mooneer/radae/radae_tx.py:71: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  checkpoint = torch.load(args.model_name, map_location='cpu')
sox WARN dither: dither clipped 8 samples; decrease volume?
(radae-venv) mooneer@macaron radae % ~/freedv-gui/build_osx/src/FreeDV.app/Contents/MacOS/FreeDV -ut rx -utmode RADE -rxfile ~/radae/tx.wav -featurefile ~/radae/features_out.f32
08:41:16 INFO /Users/mooneer/freedv-gui/src/main.cpp:400: FreeDV version 2.0.0-devel-1fa572ed starting
[snipped]
(radae-venv) mooneer@macaron radae % python3 loss.py features_in.f32 features_out.f32 --loss_test 0.2 --acq_time_test 1.0 --plot
torch.Size([1, 982, 20]) torch.Size([1, 936, 20])
Loss between features_in.f32 and features_out.f32
  loss: 0.132 start: 36 acq_time:  0.36 s
36 900
PASS
(radae-venv) mooneer@macaron radae %

Plot from above:

Screenshot 2024-11-25 at 8 42 03 AM

@tmiw
Collaborator Author

tmiw commented Nov 25, 2024

Pinging @drowe67. Does this look like the feature you're looking for?

@drowe67
Owner

drowe67 commented Nov 25, 2024

@tmiw - I think so, but there's quite a bit going on above ☝️. Can you pls snip the irrelevant bits and make it clearer? I'm trying to see which features.f32 file comes from which program. It might be useful to separate the commands rather than have one big line. Screenshots of the --plot graphs are also useful.

I think I've sorted out that annoying warning in drowe67/radae#33.

Looks like good progress 👍 - this could be very useful 🙂

@tmiw
Collaborator Author

tmiw commented Nov 25, 2024

> @tmiw - I think so, but there's quite a bit going on above ☝️. Can you pls snip the irrelevant bits and make it clearer? I'm trying to see which features.f32 file comes from which program. It might be useful to separate the commands rather than have one big line. Screenshots of the --plot graphs are also useful.

I updated the description above. Hopefully that helps?

@drowe67
Owner

drowe67 commented Nov 25, 2024

@tmiw - yes that's looking better. A couple of suggestions:

  1. inference.py is already generating rx.f32, so you don't need to run radae_txe.py. It also outputs features_out.f32, which you can use as a "reference Rx". Or you could run radae_rxe.py to generate a reference features_xx.f32 which has all the state machine and acquisition stuff in it so is perhaps more representative as a reference Rx.
  2. With the ota_test.sh development work I got into quite a bit of trouble using sox with remix to do the complex float to real conversion, it has some internal processing steps that corrupted the RADE signal like limiting floats to +/-1. So I'm using my own tools these days, e.g. cat ${tx_radae}.f32 | python3 f32toint16.py --real --scale 16383 > ${tx_radae}.raw (then use sox for the raw->wav)
  3. loss.py has some handy options to compare one input and two output feature files, and plot them, so you could compare the loss from say a reference Rx like radae_rxe.py and freedv-gui. Useful to look for any big spikes in loss part way thru, consistently higher loss, or other weirdness.
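
To make point 2 concrete, here is a minimal sketch of a complex-float-to-real-int16 conversion that saturates explicitly instead of relying on sox. It is not the actual f32toint16.py: the function name is hypothetical, and it assumes the .f32 stream holds interleaved (real, imaginary) little-endian float32 pairs, with 16383 as the scale as in the command above.

```python
import struct

def complex_f32_to_real_int16(raw: bytes, scale: float = 16383.0) -> bytes:
    """Keep the real part of interleaved (re, im) float32 samples and
    return little-endian int16 samples, saturating instead of wrapping."""
    n_floats = len(raw) // 4
    n_floats -= n_floats % 2              # drop a trailing unpaired float, if any
    floats = struct.unpack(f"<{n_floats}f", raw[:4 * n_floats])
    out = []
    for re in floats[0::2]:               # even indices hold the real parts (assumed layout)
        v = int(round(re * scale))
        out.append(max(-32768, min(32767, v)))
    return struct.pack(f"<{len(out)}h", *out)
```

Feeding it the bytes of rx.f32 and writing the result to a .raw file gives 16-bit mono samples that sox can then wrap in a WAV header without any float processing of its own.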

Reminder to self - clean up spurious prints in RADE tools! 🙂

@drowe67
Owner

drowe67 commented Nov 25, 2024

@tmiw - can the freedv-tx run in CLI mode?

  1. Here's the test use-case I have in mind:
mooneer.wav -> freedv_gui -> sound card -> audio cable -> sound card -> freedv_gui -> features_freedvgui.f32
mooneer.wav -> inference.sh -> features_in.f32,features_out.f32
loss.py features_in.f32 features_out.f32 --features_hat2 features_freedvgui.f32

where mooneer.wav is a recording of you having a regular "over" for say 15-30s. This will really let us run to ground the effects of the freedv-gui audio processing, and make sure nothing is messing with the RADE quality.

  2. If that works (i.e. the loss delta is small), "audio cable" can be replaced by a high SNR RF link (e.g. a nearby Rx), to see if the path through the radio, PA etc is messing with the loss. I'm concerned about what happens to the RADE signal (and PAPR) of the waveform through a real world Tx and PA, and if that is inducing any of the AM type "digital" artifact you have mentioned.

  3. Other experiments - try a walter.wav with his favorite Ham mic then a gamer headset. Try with and without external adjustment of freq response (like an external EQ box some Hams like to use).

These are just brain-storms atm, at some stage I'll formalise this with a test plan and we can get others involved (e.g. the August 2024 test team).

@tmiw
Collaborator Author

tmiw commented Nov 26, 2024

>   1. inference.py is already generating rx.f32, so you don't need to run radae_txe.py. It also outputs features_out.f32, which you can use as a "reference Rx". Or you could run radae_rxe.py to generate a reference features_xx.f32 which has all the state machine and acquisition stuff in it so is perhaps more representative as a reference Rx.
>   2. With the ota_test.sh development work I got into quite a bit of trouble using sox with remix to do the complex float to real conversion, it has some internal processing steps that corrupted the RADE signal like limiting floats to +/-1. So I'm using my own tools these days, e.g. cat ${tx_radae}.f32 | python3 f32toint16.py --real --scale 16383 > ${tx_radae}.raw (then use sox for the raw->wav)

When I tried converting rx.f32 directly to WAV using your script, I ended up with the following error in loss.py after FreeDV finished:

torch.Size([1, 982, 20]) torch.Size([1, 972, 20])
Loss between features_in.f32 and features_out_ref.f32
  loss: 0.129 start: 0 acq_time:  0.00 s
0 972
PASS
torch.Size([1, 982, 20]) torch.Size([1, 1020, 20])
Traceback (most recent call last):
  File "/Users/mooneer/radae/loss.py", line 108, in <module>
    min_loss2, min_start2, loss2 = find_loss(args.features, args.features_hat2)
  File "/Users/mooneer/radae/loss.py", line 72, in find_loss
    assert features_hat_seq_length <= features_seq_length
AssertionError

Not sure what's going on there.
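
The assertion in the traceback fires when the received feature file holds more frames than the input file (1020 versus 982 in the shapes printed above). A quick way to check frame counts before running loss.py is sketched below. The helper names are hypothetical, and the 36-floats-per-frame width is an assumption about the .f32 layout that is not stated in this thread; adjust it if your files differ.

```python
import os

FLOATS_PER_FRAME = 36   # assumption: float32 values stored per feature frame
BYTES_PER_FLOAT = 4

def frames_in_bytes(n_bytes: int) -> int:
    """Number of whole feature frames in a buffer of n_bytes."""
    return n_bytes // (FLOATS_PER_FRAME * BYTES_PER_FLOAT)

def frame_count(path: str) -> int:
    """Number of whole feature frames stored in a .f32 file."""
    return frames_in_bytes(os.path.getsize(path))

def hat_fits(n_features: int, n_features_hat: int) -> bool:
    """Mirror of the failing check: the received sequence must not be longer."""
    return n_features_hat <= n_features
```

If `hat_fits(frame_count("features_in.f32"), frame_count("features_out_freedv.f32"))` is false, the received file is longer than the reference and the comparison will trip the same assertion.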

>   3. loss.py has some handy options to compare one input and two output feature files, and plot them, so you could compare the loss from say a reference Rx like radae_rxe.py and freedv-gui. Useful to look for any big spikes in loss part way thru, consistently higher loss, or other weirdness.

Cool. 👍

Just did the following:

(radae-venv) mooneer@macaron radae % ./inference.sh model19_check3/checkpoints/checkpoint_epoch_100.pth wav/brian_g8sez.wav /dev/null \                                                 
    --rate_Fs --pilots --pilot_eq --eq_ls --cp 0.004 --bottleneck 3 --auxdata --write_rx rx.f32 --correct_freq_offset
[...]
(radae-venv) mooneer@macaron radae % mv features_out.f32 features_out_ref.f32 
(radae-venv) mooneer@macaron radae % cat features_in.f32 | python3 radae_tx.py model19_check3/checkpoints/checkpoint_epoch_100.pth --auxdata | python3 f32toint16.py --real --scale 16383 |  sox -t raw -e signed-integer -b 16 -c 1 -r 8000 - -t wav -e signed-integer -b 16 -r 8000 -c 1 tx.wav
[...]
(radae-venv) mooneer@macaron radae % ~/freedv-gui/build_osx/src/FreeDV.app/Contents/MacOS/FreeDV -ut rx -utmode RADE -rxfile ~/radae/tx.wav -featurefile ~/radae/features_out_freedv.f32
[...]
(radae-venv) mooneer@macaron radae % python3 loss.py features_in.f32 features_out_ref.f32 --loss_test 0.2 --acq_time_test 1.0 --plot --features_hat2 features_out_freedv.f32            
torch.Size([1, 982, 20]) torch.Size([1, 972, 20])
Loss between features_in.f32 and features_out_ref.f32
  loss: 0.129 start: 0 acq_time:  0.00 s
0 972
PASS
torch.Size([1, 982, 20]) torch.Size([1, 936, 20])
Loss between features_in.f32 and features_out_freedv.f32
  loss: 0.131 start: 36 acq_time:  0.36 s
36 900
(radae-venv) mooneer@macaron radae %

with plots below:

Screenshot 2024-11-25 at 5 05 14 PM
Screenshot 2024-11-25 at 5 05 12 PM

> @tmiw - can the freedv-tx run in CLI mode?

I don't think so but on Linux at least, you can use Xvfb (and in fact, that's what the GitHub Action does).

>   1. Here's the test use-case I have in mind:
> mooneer.wav -> freedv_gui -> sound card -> audio cable -> sound card -> freedv_gui -> features_freedvgui.f32
> mooneer.wav -> inference.sh -> features_in.wav,features_out.wav
> loss.py features_in.wav features_out.wav --features_hat2 features_freedvgui.f32
>
> where mooneer.wav is a recording of you having a regular "over" for say 15-30s. This will really let us run to ground the effects of the freedv-gui audio processing, and make sure nothing is messing with the RADE quality.

The -rxfile option for FreeDV basically does the same thing as if you went to Tools->Start Play File-From Radio immediately after startup. We should still be able to capture the features using the audio cable loopback, but I'm not sure how to correlate with whatever "microphone" audio is being fed in (plus there's currently no way to vary how long the capture is--it'll always be ~1 minute long).

@drowe67
Owner

drowe67 commented Nov 26, 2024

The plots look good, with no huge variation. Note that features_out_freedv.f32 is shorter, as it has to acquire the signal. The features_out.f32 from inference.sh assumes ideal (instant) acquisition. The blue spike at the start is the decoder converging from its starting state.
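
Because the received file starts late, comparing it to the input needs an alignment step. The sketch below shows one simple way to do that: slide the shorter sequence along the reference and keep the offset with the lowest mean squared error. It is a simplified illustration, not the code in loss.py; the function names are hypothetical, and the 10 ms frame period is inferred from the outputs above (start: 36 reported as 0.36 s).

```python
def frame_mse(a, b):
    """Mean squared error between two feature vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def find_best_start(features, features_hat, frame_s=0.01):
    """Slide features_hat along features.

    Returns (loss, start, acq_time_s) for the offset with the lowest mean
    per-frame MSE. frame_s is the assumed frame period in seconds.
    """
    if not features_hat:
        raise ValueError("received sequence is empty")
    if len(features_hat) > len(features):
        raise ValueError("received sequence is longer than the reference")
    best = None
    for start in range(len(features) - len(features_hat) + 1):
        total = sum(frame_mse(features[start + i], f_hat)
                    for i, f_hat in enumerate(features_hat))
        loss = total / len(features_hat)
        if best is None or loss < best[0]:
            best = (loss, start, start * frame_s)
    return best
```

With a 982-frame input and a 936-frame received file, this searches offsets 0 through 46, which is consistent with the start values of 36 seen in the runs above.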

> Not sure what's going on there.

I find it useful to plot the waveforms to get to the bottom of these sorts of bugs. I have Octave load_f32, load_raw functions to help, or Audacity can plot wave files.

This worked for me (top two lines generate tx.wav, the rest is a reference receiver to check tx.wav):

./inference.sh model19_check3/checkpoints/checkpoint_epoch_100.pth wav/brian_g8sez.wav /dev/null --rate_Fs --pilots --pilot_eq --eq_ls --cp 0.004 --bottleneck 3 --auxdata --write_rx rx.f32
cat rx.f32 | python3 f32toint16.py --real --scale 16383 | sox -t .s16 -r 8000 -c 1 - tx.wav

cat tx.wav | python3 int16tof32.py --zeropad | python3 radae_rxe.py -v 1 > features_rx_out.f32
python3 loss.py features_in.f32 features_out.f32 --features_hat2 features_rx_out.f32
Loss between features_in.f32 and features_out.f32
  loss: 0.128 start: 0 acq_time:  0.00 s
Loss between features_in.f32 and features_rx_out.f32
  loss: 0.129 start: 36 acq_time:  0.36 s

> The -rxfile option for FreeDV basically does the same thing as if you went to Tools->Start Play File-From Radio immediately after startup. We should still be able to capture the features using the audio cable loopback, but I'm not sure how to correlate with whatever "microphone" audio is being fed in (plus there's currently no way to vary how long the capture is--it'll always be ~1 minute long).

These problems won't be hard to resolve if we really want to perform these experiments. You need to feed the same wavefile into the freedv-gui Tx as the reference Tx (e.g. mooneer.wav). The freedv-gui features.f32 capture should start/stop on Rx sync. Have a think about it, with the use case I've described above as an example 🤔 These tests don't have to be super automated - they are for human-driven real world experiments not GitHub actions, e.g. some manual GUI clicking is OK.

@tmiw
Collaborator Author

tmiw commented Nov 27, 2024

> These problems won't be hard to resolve if we really want to perform these experiments. You need to feed the same wavefile into the freedv-gui Tx as the reference Tx (e.g. mooneer.wav). The freedv-gui features.f32 capture should start/stop on Rx sync. Have a think about it, with the use case I've described above as an example 🤔 These tests don't have to be super automated - they are for human-driven real world experiments not GitHub actions, e.g. some manual GUI clicking is OK.

I added a new option for TX feature capture:

Usage: FreeDV [-h] [--verbose] [-f <str>] [-ut <str>] [-utmode <str>] [-rxfile <str>] [-rxfeaturefile <str>] [-txfeaturefile <str>]
  -h, --help            	show this help message
  --verbose             	generate verbose log messages
  -f, --config=<str>    	Use different configuration file instead of the default.
  -ut, --unit_test=<str>	Execute FreeDV in unit test mode.
  -utmode:<str>         	Switch FreeDV to the given mode before UT execution.
  -rxfile:<str>         	In UT mode, pipes given WAV file through receive pipeline.
  -rxfeaturefile:<str>  	Capture RX features from RADE decoder into the provided file.
  -txfeaturefile:<str>  	Capture TX features from FARGAN encoder into the provided file.

Also, both -rxfeaturefile and -txfeaturefile don't require -ut, so they can be run manually. I'm imagining something like this:

  1. Execute FreeDV with -rxfeaturefile and -txfeaturefile pointing to the appropriate places.
  2. Enable full duplex mode.
  3. Press Start, talk into the microphone for ~30 seconds, then push Stop.
  4. Process the RX and TX features accordingly to get loss figures, etc.

@drowe67
Owner

drowe67 commented Nov 27, 2024

Interesting idea - but not quite what we need for the test use case I described. We want to feed an input wave file (instead of the mic input) to freedv-gui, and have it pass through the entire system. This might include a RF link instead of the audio cable. The Tx and Rx could be at separate sites (so no full duplex).

The use of a fixed, constant input wave file allows us to repeat the test, for example changing RF drive levels or mic filtering in freedv-gui. It also tests almost all of the freedv-gui audio processing, so will trap any subtle bugs.

features_in.f32 could come from a CLI version of the encoder, as per the test use case example above - we don't want to use freedv-gui to do the encoding for the reference sample as the input audio processing is part of what we are testing.

There may be some other tests we could perform with the -txfeaturefile feature (e.g. different mics), but the key to experiments is repeatability, and that will be hard to do with a new recording every time.

Note: I edited the test use case example above, had .wav instead of .f32 for some of the files.

@tmiw
Collaborator Author

tmiw commented Nov 28, 2024

> Interesting idea - but not quite what we need for the test use case I described. We want to feed an input wave file (instead of the mic input) to freedv-gui, and have it pass through the entire system. This might include a RF link instead of the audio cable. The Tx and Rx could be at separate sites (so no full duplex).

The latest commit adds a -txfile argument that does this. For example, after configuring for full duplex using a USB device with mic and speaker joined together (and disabling speex):

Mooneers-16-MacBook-Pro-16158:build_osx mooneer$ src/FreeDV.app/Contents/MacOS/FreeDV \
    -f `pwd`/freedv-fdx.conf -ut tx -utmode RADE \
    -txfile ~/Downloads/FreeDV\ Voice\ Keyer\ -\ 16\ kHz.wav \
    -rxfeaturefile `pwd`/features_out_freedv.f32

Then from the RADE repo:

(radae-venv) MacBookPro:radae mooneer$ cp ~/Downloads/FreeDV\ Voice\ Keyer\ -\ 16\ kHz.wav vk.wav # since inference.sh doesn't like filenames with spaces
(radae-venv) MacBookPro:radae mooneer$ ./inference.sh model19_check3/checkpoints/checkpoint_epoch_100.pth vk.wav /dev/null \
    --rate_Fs --pilots --pilot_eq --eq_ls --cp 0.004 --bottleneck 3 \
    --auxdata --write_rx rx.f32 --correct_freq_offset
(radae-venv) MacBookPro:radae mooneer$ mv features_out.f32 features_out_ref.f32
(radae-venv) MacBookPro:radae mooneer$ python3 loss.py features_in.f32 features_out_ref.f32 \
    --loss_test 0.2 --acq_time_test 1.0 --plot \
    --features_hat2 ~/devel/freedv-gui/build_osx/features_out_freedv.f32
Loss between features_in.f32 and features_out_ref.f32
  loss: 0.187 start: 0 acq_time:  0.00 s
PASS
Loss between features_in.f32 and /Users/mooneer/devel/freedv-gui/build_osx/features_out_freedv.f32
  loss: 0.350 start: 48 acq_time:  0.48 s

Associated graphs below:

Screenshot 2024-11-27 at 11 46 41 PM Screenshot 2024-11-27 at 11 46 32 PM

Anyway, I'll double check the RADE transmit logic again as RX seemed okay per the above comments.

@tmiw
Collaborator Author

tmiw commented Nov 28, 2024

I adjusted the scaling and that helped a tiny bit:

Loss between features_in.f32 and features_out_ref.f32
  loss: 0.187 start: 0 acq_time:  0.00 s
PASS
Loss between features_in.f32 and /Users/mooneer/devel/freedv-gui/build_osx/features_out_freedv.f32
  loss: 0.343 start: 48 acq_time:  0.48 s

(0.343 instead of 0.350)

I wonder if the sample rate adjustments made in the TX pipeline are messing things up, too. Based on my local config and the source code, it seems we go from 16 kHz (the input WAV file) -> 48 kHz -> 16 kHz (for feeding into the RADE TX), then 8 kHz -> 48 kHz for actually transmitting the signal. In theory it shouldn't, but you never know.

We're also applying 4 dB additional attenuation to the RADE signal before TX to avoid users having to adjust the TX Attenuation dial when going back and forth between 700D/E/etc. and RADE, though I wouldn't think that'd mess anything up. The attenuation step is added as follows:

        auto txAttenuationStep = new LevelAdjustStep(outputSampleRate_, []() {
            double dbLoss = g_txLevel / 10.0;
            
            if (freedvInterface.getTxMode() == FREEDV_MODE_RADE)
            {
                // Attenuate by 4 dB as there's no BPF; anything louder distorts the signal
                dbLoss -= 4.0;
            }
            
            double scaleFactor = exp(dbLoss/20.0 * log(10.0));
            return scaleFactor; 
        });
        pipeline_->appendPipelineStep(std::shared_ptr<IPipelineStep>(txAttenuationStep));

and scaling is basically:

    double scaleFactor = scaleFactorFn_();

    for (int index = 0; index < numInputSamples; index++)
    {
        outputSamples[index] = inputSamples.get()[index] * scaleFactor;
    }
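
As a worked example of the arithmetic in the attenuation step: `exp(dbLoss/20.0 * log(10.0))` is just 10^(dbLoss/20), so the extra 4 dB works out to a linear factor of about 0.631. The Python sketch below mirrors that calculation. The function name is hypothetical, and treating the level setting as tenths of a dB is an assumption based on the division by 10 in the snippet.

```python
import math

def tx_scale_factor(tx_level_tenths_db: float, rade_mode: bool) -> float:
    """Linear gain for a level setting given in tenths of a dB (assumed unit)."""
    db = tx_level_tenths_db / 10.0
    if rade_mode:
        db -= 4.0                      # the additional RADE attenuation
    return 10.0 ** (db / 20.0)         # same value as exp(db / 20 * ln(10))

# Cross-check against the exp/log form used in the C++ code
assert math.isclose(tx_scale_factor(0, True), math.exp(-4.0 / 20.0 * math.log(10.0)))
```

With the dial at 0 dB, a full-scale int16 sample (32767) comes out at about 20,675 in RADE mode, so the step on its own leaves headroom rather than causing clipping.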

@tmiw
Collaborator Author

tmiw commented Nov 28, 2024

OK, significant improvement after going from fastest to medium quality in libsamplerate:

Loss between features_in.f32 and features_out_ref.f32
  loss: 0.187 start: 0 acq_time:  0.00 s
PASS
Loss between features_in.f32 and /Users/mooneer/devel/freedv-gui/build_osx/features_out_freedv.f32
  loss: 0.228 start: 48 acq_time:  0.48 s

Graphs:

Screenshot 2024-11-28 at 12 21 52 AM Screenshot 2024-11-28 at 12 21 49 AM

I tried using SRC_SINC_BEST_QUALITY in the libsamplerate config but that got me a lot of dropouts on RX, so we're using SRC_SINC_MEDIUM_QUALITY for now. IIRC Annaliese from FlexRadio was using another resampling library for the original FreeDV waveform that was supposedly faster/better quality, so it may be worth investigating that further.

@tmiw
Collaborator Author

tmiw commented Nov 28, 2024

I tried master from libsamplerate (as it had some newer changes that enable some SIMD usage on x86) and I was able to use SRC_SINC_BEST_QUALITY. That didn't seem to improve things much, though, vs. medium quality:

(radae-venv) MacBookPro:radae mooneer$ python3 loss.py features_in.f32 features_out_ref.f32 --loss_test 0.2 --acq_time_test 1.0 --plot --features_hat2 ~/devel/freedv-gui/build_osx/features_out_freedv.f32
Loss between features_in.f32 and features_out_ref.f32
  loss: 0.187 start: 0 acq_time:  0.00 s
PASS
Loss between features_in.f32 and /Users/mooneer/devel/freedv-gui/build_osx/features_out_freedv.f32
  loss: 0.206 start: 24 acq_time:  0.24 s

(for comparison, medium quality had a loss of 0.228)

BTW I'm fairly sure it was SoX that was being used in the original FreeDV waveform: https://sourceforge.net/p/soxr/wiki/Home/. Is it worth investigating using that instead of libsamplerate?

@drowe67
Owner

drowe67 commented Nov 29, 2024

Interesting!

  1. The freedv-gui command line test framework is coming along well and is showing its value - well done 👍
  2. The big difference (0.350) indicates they are different ... but not necessarily in a bad way, e.g. there might just be some gentle filtering. Or - there could be a nasty bug. So on balance I think that it's best the path through freedv-gui has a transfer function of 1 - i.e. does not "color" the signal in any way.
  3. Try using radae_txe.py/radae_rxe.py to get the reference rather than inference.sh, as this rx includes the noise from pilot symbol estimators, and is the same thing you are running inside freedv-gui, which leaves only the audio chain as the DUT.
  4. Be cool to run this over a real radio link (at high SNR). I'm wondering if Hams messing with the Tx/Rx filtering, drive levels & real world PAs also messes up the loss.
  5. It would be useful to know the freq response of those resamplers, they are generally low pass filters of some kind. If they are busted then you might get some aliasing, which is a non-linear kind of noise. If you can wind up the number of taps you might see loss go down as they become more "brick wall".
  6. That initial spike in the loss curve might have a fairly large contribution to the mean loss. I can't recall if there is an option to clip that off the test data.
  7. Make sure the output modulated waveform is not clipping despite the 4dB gain shift. Loss should be insensitive to the magnitude of the Tx signal, as it's trained over multipath channels and has gain control via pilots. You could check this by varying drive and measuring loss.
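
For point 7, one rough way to check a captured 16-bit WAV for clipping is to look at the peak value and count samples sitting at full scale. This is a sketch under stated assumptions: the function names are hypothetical, the file is assumed to be 16-bit PCM, and "at or beyond 32767" is used as the clipping criterion.

```python
import struct
import wave

def read_wav_int16(path: str):
    """Return all samples of a 16-bit PCM WAV file as a tuple of ints."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM")
        raw = w.readframes(w.getnframes())
    return struct.unpack(f"<{len(raw) // 2}h", raw)

def clipping_stats(samples, threshold: int = 32767):
    """Return (peak, n_clipped, n_samples) for a sequence of int16 samples."""
    peak = max((abs(s) for s in samples), default=0)
    n_clipped = sum(1 for s in samples if abs(s) >= threshold)
    return peak, n_clipped, len(samples)
```

Running `clipping_stats(read_wav_int16("tx.wav"))` at a few drive settings, alongside the loss figure for each, would show whether loss only rises once samples start hitting full scale.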

@drowe67
Owner

drowe67 commented Nov 29, 2024

This is just a brainstorm for future consideration (no need to start coding), but a neat way to tune over the air:

  1. freedv-gui accepts an input reference feature file derived from a reference input wave file.
  2. Start up the GUI, press Start to start Rx running.
  3. The operator kicks off a Tx somehow, using the canned wave file.
  4. When freedv-gui receives an over, it compares the loss of the received features to the ref.
  5. And automagically prints out the loss (it's a simple mean square error)
  6. That way the operator can mess with various radio settings and Tx the same canned wavefile every time to see the effect.

A very neat side effect of the ML work is we actually have a useful objective measure of speech quality. Never had that before.

@tmiw
Collaborator Author

tmiw commented Nov 30, 2024

I went ahead and tried switching to soxr along with regenerating the reference per suggestion:

(radae-venv) MacBookPro:radae mooneer$ sox vk.wav -t raw - | ./build/src/lpcnet_demo -features - - > features_in.f32 
...
(radae-venv) MacBookPro:radae mooneer$ sox vk.wav -t raw - | ./build/src/lpcnet_demo -features - - | python3 radae_txe.py | python3 radae_rxe.py > features_out_ref.f32
...
(radae-venv) MacBookPro:radae mooneer$ python3 loss.py features_in.f32 features_out_ref.f32 --loss_test 0.2 --acq_time_test 1.0 --plot --features_hat2 ~/devel/freedv-gui/build_osx/features_out_freedv.f32
Loss between features_in.f32 and features_out_ref.f32
  loss: 0.187 start: 36 acq_time:  0.36 s
PASS
Loss between features_in.f32 and /Users/mooneer/devel/freedv-gui/build_osx/features_out_freedv.f32
  loss: 0.215 start: 48 acq_time:  0.48 s

The loss figure above doesn't seem that much different from libsamplerate at medium quality but the plots seem to show a lot smaller differences between the peaks:

Screenshot 2024-11-30 at 10 25 38 AM Screenshot 2024-11-30 at 10 25 36 AM

(I also tried VHQ quality in soxr but that didn't work properly for some reason.)

> That initial spike in the loss curve might have a fairly large contribution to the mean loss. I can't recall if there is an option to clip that off the test data.

--clip_start seems to be the right option for this:

(radae-venv) MacBookPro:radae mooneer$ python3 loss.py features_in.f32 features_out_ref.f32 --loss_test 0.2 --acq_time_test 1.0 --plot --features_hat2 ~/devel/freedv-gui/build_osx/features_out_freedv.f32 --clip_start 15
Loss between features_in.f32 and features_out_ref.f32
  loss: 0.186 start: 51 acq_time:  0.51 s
PASS
Loss between features_in.f32 and /Users/mooneer/devel/freedv-gui/build_osx/features_out_freedv.f32
  loss: 0.205 start: 63 acq_time:  0.63 s
Screenshot 2024-11-30 at 10 30 01 AM Screenshot 2024-11-30 at 10 29 59 AM
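
To illustrate why dropping the first frames can matter, the sketch below averages a per-frame loss curve with and without an initial spike. It only illustrates the idea; how loss.py implements --clip_start internally is not shown in this thread, and the function name and numbers are made up for the example.

```python
def mean_loss(per_frame_loss, clip_start=0):
    """Mean of a per-frame loss curve, ignoring the first clip_start frames."""
    kept = per_frame_loss[clip_start:]
    if not kept:
        raise ValueError("clip_start removes every frame")
    return sum(kept) / len(kept)

# Made-up curve: two large start-up frames followed by a steady 0.2
curve = [4.0, 2.0] + [0.2] * 8
with_spike = mean_loss(curve)                   # about 0.76
without_spike = mean_loss(curve, clip_start=2)  # about 0.2
```

In the runs above the effect was much smaller (0.215 to 0.205 for the freedv-gui path), which suggests the start-up spike was not the main contributor to the difference there.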

@tmiw
Collaborator Author

tmiw commented Nov 30, 2024

BTW I also tried setting the "analog" audio device sample rates to 16000 prior to running the tests and got the following:

(radae-venv) MacBookPro:radae mooneer$ python3 loss.py features_in.f32 features_out_ref.f32 --loss_test 0.2 --acq_time_test 1.0 --plot --features_hat2 ~/devel/freedv-gui/build_osx/features_out_freedv.f32 --clip_start 15
Loss between features_in.f32 and features_out_ref.f32
  loss: 0.186 start: 51 acq_time:  0.51 s
PASS
Loss between features_in.f32 and /Users/mooneer/devel/freedv-gui/build_osx/features_out_freedv.f32
  loss: 0.195 start: 63 acq_time:  0.63 s

which doesn't seem much different than the 0.205 above. I'd say resampling likely isn't much of a factor now.

@drowe67
Owner

drowe67 commented Nov 30, 2024

> which doesn't seem much different than the 0.205 above. I'd say resampling likely isn't much of a factor now.

Yes I agree. A few thoughts:

  1. We're still gaining experience with the loss tool so hard to say exactly what difference we would expect. But I think it's fair to say it showed some big differences when using the previous resampler.
  2. There may be better ways to test the signal processing chain, e.g. put a sine wave through it, make sure there are no sample slips, define a "mask" (e.g. -3dB points, -20dB points) for the input audio.
  3. The reference loss (0.18) you are getting for that sample is rather large (compared to some other samples). I'm wondering if it's something to do with the input sample, microphone, level, background noise, your voice (!) etc. We still have much to learn about the "care and feeding" of RADE - but nice to have a tool to help.
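
The sine wave test in point 2 could look something like the sketch below: generate a tone, pass it through the chain under test, and measure the gain at that frequency by correlating against a sine and cosine. Everything here is an assumption for illustration (function names, the 8 kHz rate, and the idea of comparing measured gains against -3 dB or -20 dB mask points); it is not an existing freedv-gui or RADE tool.

```python
import math

def sine(freq_hz, fs_hz, n, amplitude=1.0):
    """n samples of a sine wave at freq_hz, sampled at fs_hz."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / fs_hz) for i in range(n)]

def tone_level_db(samples, freq_hz, fs_hz):
    """Level of the freq_hz component in dB relative to amplitude 1.0.

    Most accurate when the buffer holds a whole number of cycles.
    """
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / fs_hz) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / fs_hz) for i, s in enumerate(samples))
    amp = 2.0 * math.hypot(re, im) / n
    return 20.0 * math.log10(max(amp, 1e-12))   # floor avoids log10(0)

def gain_db(inp, out, freq_hz, fs_hz):
    """Gain of the system under test at freq_hz, from input and output buffers."""
    return tone_level_db(out, freq_hz, fs_hz) - tone_level_db(inp, freq_hz, fs_hz)
```

Sweeping `freq_hz` across the audio band and recording `gain_db` at each step gives a rough frequency response that can be compared against a mask.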

@drowe67
Owner

drowe67 commented Nov 30, 2024

@tmiw - I messed around with wav/mooneer.wav and it also has a higher loss compared to many other samples in wav. The sample sounds to me like it has some echo or reverb. On Audacity using select-all and Analyze-Plot Spectrum I can see a low pass response. Suggest you try a gamer style headset (wired) and see if that helps.

I'm using:

sox wav/mooneer.wav -t raw - | ./build/src/lpcnet_demo -features - - | tee features_in.f32 | python3 radae_txe.py | python3 radae_rxe.py -v 1 | tee features_out.f32 | ./build/src/lpcnet_demo -fargan-synthesis - - | aplay -r 16000 -f S16_LE; python3 loss.py features_in.f32 features_out.f3

To process, listen, measure and plot the loss.

Screenshot from 2024-12-01 07-59-08

@tmiw
Collaborator Author

tmiw commented Dec 2, 2024

> @tmiw - I messed around with wav/mooneer.wav and it also has a higher loss compared to many other samples in wav. The sample sounds to me like it has some echo or reverb. On Audacity using select-all and Analyze-Plot Spectrum I can see a low pass response. Suggest you try a gamer style headset (wired) and see if that helps.

I don't have a headset immediately on me, but I did rerecord the file I used for previous tests with an XLR microphone and my Scarlett Solo USB interface. This seemed to turn out better:

Screenshot 2024-12-02 at 1 26 40 AM

Loss figures:

Loss between features_in.f32 and features_out_ref.f32
  loss: 0.091 start: 51 acq_time:  0.51 s
PASS

(I can't currently test via loopback, unfortunately, but you can grab the file here (zipped due to GH not allowing direct uploads of .wav): FreeDV Voice Keyer - 16 kHz.wav.zip)

@drowe67
Owner

drowe67 commented Dec 2, 2024

Wow! That's a big drop in loss. The input sample sounds a lot nicer to me than wav/mooneer.wav - lots more high freq. That spectrum plot is similar in shape to several other samples in radae/wav. I'm not exactly sure how to interpret these plots as they are the average of several seconds of speech shaped by the microphone/room frequency response. Hopefully not shaped by freedv-gui any more, or not much.

Another brainstorm: a tool/process for end users to test their microphone, e.g. record a sample, RADE enc/dec it (the modem wouldn't need to be in the path), print out a loss. BTW not suggesting we code this now, just 🤔 out loud...(maybe I should move this to an Issue). IIRC we have a similar metric from the 700D mic eq printed out on freedv-gui.

One of the problems with general purpose Ham applications is the wide variety of input microphones, and custom filtering arrangements. Icom know what the freq response is of their microphone so can get consistent results.

In the ML world - another way to handle this is train with a variety of spectral shaping.

Other idea: we should be documenting this in a "RADE V1 user guide". e.g. use mics without reverb, check the loss

@drowe67
Owner

drowe67 commented Dec 2, 2024

I tried running the XLR sample through the RADE stack and the background noise on the input sample turns into an annoying frame rate buzzing 🤷 . Still much to learn about RADE tuning I guess.

@tmiw
Collaborator Author

tmiw commented Dec 3, 2024

Hmm, I fixed the bug that was keeping VHQ mode from working (and apparently HQ mode too, on the M1 Mac Mini), and now loss in both resampling modes is a lot higher, with no apparent difference between VHQ and HQ. From the file I uploaded above:

Loss between features_in.f32 and features_out_ref.f32
  loss: 0.091 start: 51 acq_time:  0.51 s
PASS
Loss between features_in.f32 and features_out_freedv.f32
  loss: 4.932 start: 0 acq_time:  0.00 s

Screenshot 2024-12-03 at 12 32 36 AM

The audio coming from freedv-gui still seems to sound okay so I'm not sure what's going on. It could be some sort of bug in how I'm capturing the RX features or maybe something else. Unfortunately my laptop (where I did the previous testing) had a water-based incident, so confirming the results there has to wait until it comes back from Apple.

@drowe67
Owner

drowe67 commented Dec 3, 2024

@tmiw - what was the engineering justification for the move to soxr? I'm concerned that we are running down a rabbit hole here based on an ad-hoc change.

If you really want to run this resampler business to ground I can help you come up with some signal processing based tests to set appropriate "goal posts" for the work.
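
One possible shape for such a test, sketched here under stated assumptions, is to pass a pure tone through the resampler and measure passband gain and SNR by fitting a sinusoid at the known frequency. The linear interpolator below is only a stand-in so the example runs on its own; it is not soxr or libsamplerate, and the function names and the choice of a 1 kHz tone are hypothetical. A real test would substitute the resampler under test and pick pass/fail thresholds deliberately.

```python
import math


def linear_upsample(x, factor):
    """Stand-in resampler: linear interpolation by an integer factor."""
    out = []
    for i in range(len(x) - 1):
        step = (x[i + 1] - x[i]) / factor
        out.extend(x[i] + step * k for k in range(factor))
    return out


def tone_gain_and_snr_db(y, tone_hz, sample_rate, input_amplitude=1.0):
    """Fit a sinusoid at tone_hz to y; return (gain in dB, SNR in dB).

    Assumes y holds a whole number of tone cycles, so the fitted tone and
    the residual are orthogonal and their powers simply add.
    """
    n = len(y)
    w = 2.0 * math.pi * tone_hz / sample_rate
    a = 2.0 / n * sum(v * math.cos(w * i) for i, v in enumerate(y))
    b = 2.0 / n * sum(v * math.sin(w * i) for i, v in enumerate(y))
    signal_power = (a * a + b * b) / 2.0
    total_power = sum(v * v for v in y) / n
    noise_power = max(total_power - signal_power, 1e-20)
    gain_db = 10.0 * math.log10(signal_power / (input_amplitude ** 2 / 2.0))
    snr_db = 10.0 * math.log10(signal_power / noise_power)
    return gain_db, snr_db


# Example: 1 kHz tone, 16 kHz -> 48 kHz through the stand-in resampler.
fs_in, fs_out, tone = 16000, 48000, 1000.0
x = [math.sin(2.0 * math.pi * tone * i / fs_in) for i in range(1601)]
y = linear_upsample(x, fs_out // fs_in)  # 4800 samples = 100 whole cycles
gain_db, snr_db = tone_gain_and_snr_db(y, tone, fs_out)
print(f"gain {gain_db:.2f} dB, SNR {snr_db:.1f} dB")
```

Everything that is not the fitted tone (images, aliasing, distortion) counts as noise in this measure, so repeating it across several tone frequencies would show where a given resampler setting starts to degrade.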

@tmiw
Collaborator Author

tmiw commented Dec 3, 2024

> @tmiw - what was the engineering justification for the move to soxr? I'm concerned that we are running down a rabbit hole here based on an ad-hoc change.
>
> If you really want to run this resampler business to ground I can help you come up with some signal processing based tests to set appropriate "goal posts" for the work.

soxr promises lower CPU usage and at least a little bit of an increase in audio quality. That said, a lot of the resampler-related degradation did resolve itself when I increased quality to medium on libsamplerate.

Anyway, I was able to get the ctests working but haven't been able to push the changes yet. The problem is that I'm still getting higher loss than expected, but I don't know if it's something with the machine I'm using (I did previous testing on an x86 laptop, the more recent testing is on an ARM machine) or a bug that got introduced with a recent commit. I'll see if I can spend a bit more time chasing it down and if I don't get anywhere, I'll spin off the soxr changes into a separate PR and come back to them later.

@drowe67
Owner

drowe67 commented Dec 4, 2024

> soxr promises lower CPU usage and at least a little bit of an increase in audio quality.

What evidence can you point me at to support the claim of increased quality? AFAIK there are no systematic tests of resamplers atm?

CPU load in resamplers is not what we should be pursuing atm - in fact the results above with low resolution libsrc resamplers suggest optimisation is something we should avoid until we really know what we are doing (i.e. until we have tests that systematically exercise the resamplers and alert us to any subtle issues). We have been here before - premature and unwarranted optimisation will cause us problems.

We need to focus on the lowest risk approach to a bug free freedv-gui signal processing path to support RADE.

@tmiw
Collaborator Author

tmiw commented Dec 4, 2024

> What evidence can you point me at to support the claim of increased quality? AFAIK there are no systematic tests of resamplers atm?

I was using the loss figures from loss.py as a basis for comparison.

> We need to focus on the lowest risk approach to a bug free freedv-gui signal processing path to support RADE.

I checked out ce8c1ef (the final one before the introduction of soxr) and tried running the tests I've been doing again. On this machine at least, there didn't seem to be much difference in terms of the loss figure from loss.py vs. with soxr, so I went ahead and reverted those changes. Once the GH actions are done, I'll go ahead and merge and start another PR with the changes after that commit for future investigation.

tmiw merged commit ce76e7f into v2.0-dev Dec 4, 2024
4 checks passed
if (freedvInterface.getTxMode() == FREEDV_MODE_RADE)
{
    // Attenuate by 4 dB as there's no BPF; anything louder distorts the signal
    dbLoss -= 4.0;
}
#endif // 0

Owner


The level of the RADE signal should be set at the float to int conversion stage (correctly done above ☝️), not tweaked later in the Tx signal processing. This code should be rm-ed, not just if-deffed out.
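
To illustrate the idea of fixing the level once at the float to int conversion, here is a minimal sketch. It is not the freedv-gui code (which is C++), and the function name and the `peak` value of 16384 are hypothetical placeholders rather than the scaling RADE actually uses.

```python
def float_to_int16(samples, peak=16384):
    """Scale float samples (nominally -1..1) and saturate to the int16 range.

    `peak` is a hypothetical full-scale value chosen for illustration only.
    """
    out = []
    for s in samples:
        v = int(round(s * peak))
        out.append(max(-32768, min(32767, v)))  # clip instead of wrapping
    return out
```

With the level decided in one place like this, later Tx stages would not need a separate mode-specific attenuation such as the `dbLoss` adjustment in the snippet above.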

Collaborator Author


Removed in 2c3f6a2.
